Human interaction recognition based on RGB and skeleton data fusion model
JI Xiaofei, QIN Linlin, WANG Yangyang
Journal of Computer Applications    2019, 39 (11): 3349-3354.   DOI: 10.11772/j.issn.1001-9081.2019040633
In recent years, significant progress has been made in human interaction recognition based on RGB video sequences. However, because RGB data lacks depth information, accurate recognition results cannot be obtained for complex interactions. Depth sensors (such as the Microsoft Kinect) can effectively improve the tracking accuracy of whole-body joint points and provide three-dimensional data that accurately tracks the movement and changes of the human body. According to the respective characteristics of RGB and joint point data, a convolutional neural network model based on dual-stream fusion of RGB and joint point data was proposed. Firstly, the region of interest of the RGB video in the time domain was obtained by using the ViBe algorithm, and key frames were extracted and mapped to the RGB space to obtain a spatial-temporal map representing the video information, which was fed into a convolutional neural network to extract features. Secondly, a vector was constructed in each frame of the joint point sequence to extract the Cosine Distance (CD) and Normalized Magnitude (NM) features; the CD and NM features of each frame were concatenated in the temporal order of the joint point sequence and fed into a convolutional neural network to learn higher-level temporal features. Finally, the softmax recognition probability matrices of the two information sources were fused to obtain the final recognition result. The experimental results show that combining RGB video information with joint point information effectively improves the recognition of human interaction behavior, achieving recognition rates of 92.55% and 80.09% on the public SBU Kinect Interaction database and the NTU RGB+D database respectively, which verifies the effectiveness of the proposed model for recognizing interactions between two people.
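To make the decision-level fusion step concrete, below is a minimal Python sketch of fusing the two streams' softmax score matrices by a weighted sum. The abstract does not specify the fusion rule, so the equal weighting and the toy scores here are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def fuse_softmax_scores(p_rgb, p_skel, w_rgb=0.5):
    # p_rgb, p_skel: (n_samples, n_classes) softmax probability matrices
    # from the RGB stream and the skeleton stream; a weighted sum is one
    # common decision-level fusion rule (the actual weights are assumed)
    fused = w_rgb * p_rgb + (1.0 - w_rgb) * p_skel
    return fused.argmax(axis=1)          # predicted class per sample

# toy scores for 3 clips over 8 interaction classes (SBU defines 8 classes)
rng = np.random.default_rng(0)
p_rgb = rng.dirichlet(np.ones(8), size=3)
p_skel = rng.dirichlet(np.ones(8), size=3)
print(fuse_softmax_scores(p_rgb, p_skel))
```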
Human interaction recognition based on statistical features of key frame feature library
JI Xiaofei, ZUO Xinmeng
Journal of Computer Applications    2016, 36 (8): 2287-2291.   DOI: 10.11772/j.issn.1001-9081.2016.08.2287
Issues such as high computational complexity and low recognition accuracy still exist in human interaction recognition. In order to solve these problems, an effective method based on statistical features over a key frame feature library was proposed. Firstly, global GIST and regional Histogram of Oriented Gradient (HOG) features were extracted from the pre-processed videos. Secondly, the training videos of each action class were clustered separately by the k-means algorithm to obtain the key frame features of each action, which were used to construct the key frame feature library; in addition, a similarity measure was used to calculate the frequency of the different key frames in every interaction video, yielding a statistical histogram representation of each interaction video. Finally, decision-level fusion was achieved by using a Support Vector Machine (SVM) classifier with a histogram intersection kernel. The experimental results on the standard UT-Interaction dataset show that the correct recognition rate of the proposed method is 85%, which indicates that the proposed method is simple and effective.
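A minimal sketch of this key-frame-library pipeline is given below using scikit-learn. Random vectors stand in for the GIST/HOG frame features, the cluster count and the Euclidean frame-to-key-frame similarity are assumptions for illustration, and the histogram intersection kernel is supplied to the SVM as a precomputed kernel matrix.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def build_keyframe_library(features_by_action, k=5):
    # cluster each action class's frame features separately; the cluster
    # centres serve as that action's key-frame features
    centres = []
    for feats in features_by_action:              # feats: (n_frames, dim)
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(feats)
        centres.append(km.cluster_centers_)
    return np.vstack(centres)                     # (n_actions * k, dim)

def video_histogram(frame_feats, library):
    # assign every frame to its most similar key frame (Euclidean distance
    # assumed here) and count key-frame frequencies, normalized to sum to 1
    d = np.linalg.norm(frame_feats[:, None, :] - library[None, :, :], axis=2)
    h = np.bincount(d.argmin(axis=1), minlength=len(library)).astype(float)
    return h / max(h.sum(), 1.0)

def hist_intersection_kernel(A, B):
    # K(x, y) = sum_i min(x_i, y_i)
    return np.array([[np.minimum(a, b).sum() for b in B] for a in A])

# usage sketch with random stand-in features: 3 classes, 128-d frame features
rng = np.random.default_rng(0)
features_by_action = [rng.random((60, 128)) for _ in range(3)]
library = build_keyframe_library(features_by_action)

X_train = np.array([video_histogram(rng.random((40, 128)), library) for _ in range(12)])
y_train = np.repeat([0, 1, 2], 4)
clf = SVC(kernel="precomputed").fit(hist_intersection_kernel(X_train, X_train), y_train)

X_test = np.array([video_histogram(rng.random((40, 128)), library)])
print(clf.predict(hist_intersection_kernel(X_test, X_train)))
```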
Two-person interaction recognition based on improved spatio-temporal interest points
WANG Peiyao, CAO Jiangtao, JI Xiaofei
Journal of Computer Applications    2016, 36 (10): 2875-2879.   DOI: 10.11772/j.issn.1001-9081.2016.10.2875
Concerning the problems of unsatisfactory feature extraction and the low recognition rate caused by redundant words in the clustering dictionary in practical surveillance video for two-person interaction recognition, a Bag Of Words (BOW) model based on improved Spatio-Temporal Interest Point (STIP) features was proposed. First of all, the foreground movement area of the interaction was detected in the image sequences by an information-entropy-based method, and then STIPs were extracted within the detected area and described by the 3-Dimensional Scale-Invariant Feature Transform (3D-SIFT) descriptor, improving the accuracy of interest point detection. Second, the BOW model was built by using an improved Fuzzy C-Means (FCM) clustering method to obtain the dictionary, and the representation of each training video was obtained by projection onto the dictionary. Finally, the nearest neighbor classification method was chosen for two-person interaction recognition. Experimental results show that, compared with a recent STIP feature algorithm, the improved method with entropy-based foreground detection achieves a recognition rate of 91.7%. The results demonstrate that the entropy-based detection method combined with the improved BOW model can greatly improve the accuracy of two-person interaction recognition, and that the method is suitable for dynamic backgrounds.
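The bag-of-words and nearest-neighbor stages can be sketched as follows. Plain k-means with hard assignment stands in for the paper's improved FCM clustering, and random vectors stand in for the 3D-SIFT descriptors, so the dictionary size, descriptor dimension, and outputs are purely illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

def bow_histogram(descriptors, codebook):
    # project a video's local STIP descriptors onto the dictionary and
    # count visual-word occurrences (hard assignment here, as a stand-in
    # for the paper's fuzzy memberships), then normalize
    words = codebook.predict(descriptors)
    h = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return h / max(h.sum(), 1.0)

rng = np.random.default_rng(0)
train_descs = [rng.random((100, 64)) for _ in range(10)]   # stand-in descriptors
train_labels = np.repeat(np.arange(5), 2)                  # 5 hypothetical classes

# k-means dictionary learning stands in for the improved FCM clustering
codebook = KMeans(n_clusters=50, n_init=10, random_state=0)
codebook.fit(np.vstack(train_descs))

X_train = np.array([bow_histogram(d, codebook) for d in train_descs])
nn = KNeighborsClassifier(n_neighbors=1).fit(X_train, train_labels)

test_desc = rng.random((80, 64))
print(nn.predict([bow_histogram(test_desc, codebook)]))
```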
Design and implementation of multi-target detection and recognition algorithm for optical remote sensing image
JI Xiaofei, QIN Ningli
Journal of Computer Applications    2015, 35 (11): 3302-3307.   DOI: 10.11772/j.issn.1001-9081.2015.11.3302
At present, the processing and analysis of optical remote sensing images is mostly concentrated on the detection and recognition of a single target, so multi-target detection and recognition has become an important research topic. A multi-target detection and recognition algorithm for optical remote sensing images was proposed. Firstly, an adaptive threshold algorithm was used for fast target detection and segmentation. Then, a hierarchical BoF-SIFT (Bag of Features-Scale Invariant Feature Transform) feature was constructed by effectively combining an image pyramid with the BoF-SIFT feature, so as to represent both the global and local features of the target and describe its distribution characteristics in detail. Finally, support vector machines based on radial basis function kernels were used as weak classifiers in the AdaBoost algorithm; after iteratively updating the sample weights, a strong classifier was obtained to complete the classification and recognition of the target images, reaching a recognition rate of 93.52%. Extensive experimental results show that the proposed algorithm segments multiple classes of targets in remote sensing images effectively, the selected features are appropriate, and the recognition method is fast and effective.
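The boosting stage can be sketched with scikit-learn (version 1.2 or later is assumed for the estimator argument). RBF-kernel SVMs serve as weak learners under the SAMME boosting variant, and the features and labels below are synthetic placeholders for the hierarchical BoF-SIFT representation of segmented targets.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.svm import SVC

# synthetic stand-ins for hierarchical BoF-SIFT histograms of target chips
rng = np.random.default_rng(0)
X = rng.random((200, 300))                       # 200 targets, 300-d features
y = np.minimum((X[:, 0] * 4).astype(int), 3)     # 4 hypothetical target classes

# RBF-kernel SVMs as weak learners; SAMME only needs hard class predictions
# from each weak learner, and AdaBoost reweights the samples each round
weak = SVC(kernel="rbf", gamma="scale")
clf = AdaBoostClassifier(estimator=weak, n_estimators=10,
                         learning_rate=0.5, algorithm="SAMME")
clf.fit(X, y)
print(clf.predict(X[:5]))
```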